Implemented the possibility to load predictions from details files and continue evaluating from there #488

Merged
merged 10 commits into from
Jan 29, 2025

Conversation

JoelNiklaus
Contributor

Fixes #467.

@clefourrier
Member

This is very nice, thanks!

We'll still have to add a saving step at the end of each evaluation type, for example after the loglikelihood evals or after the generative ones. (We should be able to safely assume that once the first batch of a given type runs, the rest will run fine too, since the items are sorted by inverse length.)
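
For illustration only, the saving step suggested above could look roughly like the sketch below. It is not code from this PR: `run_by_type`, `run_one`, the JSON output format, and the one-file-per-type layout are all hypothetical names and choices made for the example.

```python
import json
from pathlib import Path


def run_by_type(requests_by_type, run_one, out_dir):
    """Run each request type in turn and save its outputs before the next type starts.

    Hypothetical sketch: `requests_by_type` maps a type name (for example
    "loglikelihood" or "generative") to a list of prompt strings, and
    `run_one(request_type, prompt)` stands in for the model call.
    """
    out_dir = Path(out_dir)
    saved = {}
    for request_type, prompts in requests_by_type.items():
        # Longest prompts first: if the largest input runs, shorter ones should too.
        ordered = sorted(prompts, key=len, reverse=True)
        outputs = [
            {"prompt": prompt, "prediction": run_one(request_type, prompt)}
            for prompt in ordered
        ]
        # Persist this type's results so a later failure does not lose them.
        path = out_dir / f"{request_type}.json"
        path.write_text(json.dumps(outputs))
        saved[request_type] = path
    return saved
```

With this layout, a crash during the generative evals would still leave the loglikelihood results on disk.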

@HuggingFaceDocBuilderDev
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@JoelNiklaus
Contributor Author

Sorry, I am not sure I fully follow. Please let me know what changes are necessary.

Member

@clefourrier left a comment

LGTM

@NathanHB NathanHB merged commit 94fc5a2 into huggingface:main Jan 29, 2025
3 checks passed
hynky1999 pushed a commit that referenced this pull request May 22, 2025
…d continue evaluating from there (#488)

* Implemented the possibility to load predictions from details files and continue evaluating from there.

* Run model as fallback when no details can be loaded.

* Improved loading speed and added more useful error messages.

* Fixed typo.

* Fixed gnarly bug with details loading to prevent loading too many examples.

* Unpacking predictions to fix issue with weirdly saved predictions.

* Made bulk loading easier by also allowing first timestamp more generally.

* Made loading details more robust against tensors being saved in the details files.
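
As a rough sketch of the behaviour these commits describe (reuse saved predictions, cap the number of loaded examples, unpack nested predictions, and run the model as a fallback), consider the following. It is an assumption-laden illustration, not the PR's implementation: `load_predictions`, `evaluate`, the `"prediction"` key, and the JSON file format are hypothetical, and the real details files may use a different format and schema.

```python
import json
from pathlib import Path


def load_predictions(details_path, expected_count):
    """Return saved predictions, or None if they cannot be used (hypothetical helper)."""
    path = Path(details_path)
    if not path.exists():
        return None
    try:
        rows = json.loads(path.read_text())
    except json.JSONDecodeError:
        return None
    if not isinstance(rows, list):
        return None
    predictions = []
    for row in rows[:expected_count]:  # never load more examples than requested
        if not isinstance(row, dict) or "prediction" not in row:
            return None
        pred = row["prediction"]
        # Unpack predictions that were saved wrapped in single-element lists.
        while isinstance(pred, list) and len(pred) == 1:
            pred = pred[0]
        predictions.append(pred)
    if len(predictions) < expected_count:
        return None  # too few saved examples: caller should rerun the model
    return predictions


def evaluate(prompts, details_path, run_model, metric):
    """Score prompts with `metric`, reusing saved predictions when possible."""
    predictions = load_predictions(details_path, len(prompts))
    if predictions is None:
        # Fallback: no usable details file, so generate predictions again.
        predictions = [run_model(prompt) for prompt in prompts]
    return [metric(prompt, pred) for prompt, pred in zip(prompts, predictions)]
```

Returning `None` rather than raising keeps the fallback path in one place: the caller decides whether to rerun the model.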
Development

Successfully merging this pull request may close these issues.

[FT] Rerun evaluations with new metrics based on completions saved in details file
4 participants